Abstract:


California has recently experienced extreme drought conditions, placing a significant 
strain on the water supply and causing environmental damage. Despite recent increases in 
precipitation, the need to forecast future dry conditions for preparation and harm 
minimization remains. This paper employs a transfer learning approach to train a Long 
Short-Term Memory (LSTM) model on historical drought data. The model uses an 
autoregressive implementation for forecasting future drought conditions. The insights 
gained from these predictions suggest that California will need to take precautions to 
mitigate harm. 


Introduction: 


A. Background: 


California's water consumption is significantly higher than that of other states, with nearly 30
billion gallons (about 115 billion L) used annually. A large portion of this water is used
for agriculture, reflecting California's vital role in the nation's economy. Consequently, 
droughts not only cause water shortages for California's households but also lead to 
nationwide increases in food prices. Moreover, the dry conditions significantly increase 
the risk of forest fires. Therefore, drought is one of California's most pressing issues. 
Despite this, Californians often focus on the present drought conditions rather than future 
forecasts. Accurate drought forecasting could help California better manage its water 
supply and mitigate the impact of dry seasons. Given the cyclical and noisy nature of 
drought data, a simple regression model would not suffice. Instead, this paper uses a deep
neural network with the LSTM architecture and a two-stage training process called
transfer learning. This approach achieved 98.33% accuracy on one-week forecasts; longer
horizons are reached by feeding forecasts back into the model, at the cost of compounding
errors.


B. Question: 


This study aims to answer the question, "How can California prepare for upcoming 
drought seasons?" Therefore, it is not sufficient to merely forecast drought conditions. 
Efforts must be made to suggest water conservation measures or other strategies to 
minimize the impact of potential droughts. 


C. Hypothesis: 


This study hypothesizes that the current decrease in aridity due to high amounts of 
precipitation is temporary. Given the global warming phenomenon that is increasing 
average temperatures worldwide, future drought seasons could be more intense. As such,
Californians must reduce their water consumption and implement more efficient
irrigation systems to mitigate the effects of the next drought.


Methodology: 
A. Dataset: 


The U.S. Drought Monitor provides a time-series data table with weekly data points from
April 2000 to July 2023. Each datapoint shows the percentage of the region's area
covered by each drought severity level. Drought severity levels range from D0 to D4,
with a separate level indicating no drought. Shown below is a graph of California’s
drought severity.


California’s dataset contained roughly 1,200 datapoints, which was insufficient to train
a large, complex neural network. Datasets from other regions across the nation were
therefore incorporated. In total, the combined dataset contained
approximately 12,000 datapoints.


B. Feature Selection: 


The main features fed into the model are the drought severity levels. In addition, a
“DSCI index” calculated from the drought severity levels is used as an input that
summarizes the strength of the drought at that time-step. Since climate follows a
cyclical pattern throughout the year, it makes sense to use the date of the time-step.
However, feeding the model a raw timestamp doesn’t capture the cyclical nature of the
yearly seasons. Instead, an approach called circular encoding is implemented. The date
of the time-step is converted to the week of the year (1-52). The week is then
normalized linearly between 0 and 2π and passed through a trigonometric function to get
an encoded representation between −1 and 1. Both sine and cosine representations are
used together because a single trigonometric function is ambiguous: it maps two
different weeks of the year to the same value. All in all, nine inputs are used in each
time-step (the six severity levels, the DSCI index, and the two week encodings). The
output layer produces six numbers corresponding to the future percentage of each drought
severity level. The context window for the input time-steps is 104 weeks (about 2 years).
This number was decided upon after consideration of hardware constraints. The basic
model output predicts the drought severity levels one week in the future. However, this
timeframe is extended using a method explained in later sections.
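The circular encoding described above can be illustrated with a minimal sketch. The exact mapping from week number to angle is not stated in this paper, so the choice below (week 1 maps to angle 0) is an assumption, as are the function names and the ordering of the nine features.

```python
import math
from typing import List, Tuple

WEEKS_PER_YEAR = 52  # the paper maps each date to a week in 1-52


def encode_week(week: int) -> Tuple[float, float]:
    """Return the (sine, cosine) circular encoding of a week-of-year in 1..52."""
    if not 1 <= week <= WEEKS_PER_YEAR:
        raise ValueError("week must be in 1..52")
    # Assumption: week 1 maps to angle 0 and week 52 to just under 2*pi.
    angle = 2.0 * math.pi * (week - 1) / WEEKS_PER_YEAR
    return math.sin(angle), math.cos(angle)


def build_features(levels: List[float], dsci: float, week: int) -> List[float]:
    """Assemble one nine-value time-step: six levels, DSCI, sine, cosine.

    `levels` holds the six area percentages (no drought, D0..D4).
    The ordering of the nine values is an assumption made for illustration.
    """
    if len(levels) != 6:
        raise ValueError("expected six severity percentages")
    sin_w, cos_w = encode_week(week)
    return [*levels, dsci, sin_w, cos_w]


def circular_distance(week_a: int, week_b: int) -> float:
    """Euclidean distance between two encoded weeks on the unit circle."""
    sin_a, cos_a = encode_week(week_a)
    sin_b, cos_b = encode_week(week_b)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)
```

Under this mapping, week 52 and week 1 are as close to each other as week 1 and week 2, which a raw week number would not capture. The sine alone is nearly zero for both week 1 and week 27, which is why the cosine is needed to tell them apart.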


C. Model Implementation: 


Since the forecast was intended to focus on the California region, training on the combined
dataset alone would introduce unnecessary noise. To address this issue, this study
uses transfer learning. In the first stage of training, a smaller model is trained on
the nationwide dataset. During the second stage, the output layers of the model are
fine-tuned on the California-only dataset. This method addresses the issue
of national drought patterns differing slightly from those of the California region
while still giving the neural network enough data to accurately model
general drought trends.


Shifting the focus to model architecture, an LSTM (Long Short-Term Memory) network was
used. LSTM networks excel at time-series prediction due to their inherent structure.
Each historical time-step is fed through the network individually in chronological order.
The LSTM structure contains processes which encode the prior time-steps, similar to
human memory. If a time-step is deemed irrelevant to future predictions, it is “forgotten”
by the network. The criterion for “forgetting” a time-step is learned through
gradient descent. This architecture is explained further in Understanding LSTM by
Staudemeyer and Morris. These mechanisms are well suited to modelling
drought patterns, which depend on chronological sequences.


D. Hyperparameters and Reproducibility: 


The first training stage incorporates a model with a nine-neuron input layer followed by two
LSTM layers of 64 neurons each. The output layer has six neurons, and
solely linear activation functions are used in this stage. The second training stage
freezes the weights of the first two LSTM layers but adds a third, trainable LSTM layer
which also has 64 neurons. A feedforward layer of 32 neurons with a ReLU
activation is incorporated before the output layer, which is also retrained to fit the smaller
dataset. The batch size for the first stage was set at 2048 with the gradients
backpropagated every batch. The second stage did not require mini-batch training because of
the small dataset size.
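To make the split between frozen and trainable weights concrete, the following sketch counts the parameters implied by the layer sizes above. It is a worked estimate rather than a figure reported by this study: it assumes a standard LSTM cell with one bias vector per gate (some frameworks keep two, which changes the totals) and a fresh 32-to-6 output layer in the second stage.

```python
def lstm_params(n_in: int, n_hidden: int) -> int:
    """Parameters in one LSTM layer: 4 gates, each with input weights,
    recurrent weights, and a single bias vector (an assumption)."""
    return 4 * (n_in * n_hidden + n_hidden * n_hidden + n_hidden)


def dense_params(n_in: int, n_out: int) -> int:
    """Parameters in a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out


# Stage 1: 9 inputs -> LSTM(64) -> LSTM(64) -> linear output of 6.
stage1 = {
    "lstm_1": lstm_params(9, 64),
    "lstm_2": lstm_params(64, 64),
    "output": dense_params(64, 6),
}

# Stage 2: the first two LSTM layers are frozen; a third LSTM(64),
# a 32-neuron ReLU layer, and a new 6-neuron output layer are trained.
stage2_frozen = stage1["lstm_1"] + stage1["lstm_2"]
stage2_trainable = {
    "lstm_3": lstm_params(64, 64),
    "dense_relu": dense_params(64, 32),
    "output": dense_params(32, 6),
}

if __name__ == "__main__":
    print("stage 1 total:", sum(stage1.values()))
    print("stage 2 frozen:", stage2_frozen)
    print("stage 2 trainable:", sum(stage2_trainable.values()))
```

Under these assumptions the second stage trains fewer parameters than the first while reusing everything learned from the nationwide dataset, which suits the much smaller California-only dataset.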


Results and Discussion: 


A. Training: 


The model's accuracy improved rapidly at first but slowed as the loss approached a
minimum. Training was halted when the model started showing early signs of overfitting,
that is, when the training loss continued to decrease without a corresponding decrease
in the validation loss. Both stages showed similar
learning curves, although the second stage naturally had a higher initial accuracy due to
its "general knowledge" of the drought patterns.
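The stopping rule described above can be sketched as a patience-based check on the validation loss. This paper does not state the exact criterion used, so the `patience` and `min_delta` parameters below are hypothetical and serve only to illustrate the idea.

```python
from typing import Sequence


def early_stopping_epoch(
    val_losses: Sequence[float], patience: int = 3, min_delta: float = 0.0
) -> int:
    """Return the index of the epoch with the best validation loss seen
    before training would be halted.

    Training stops once `patience` consecutive epochs pass without the
    validation loss improving by more than `min_delta`.
    """
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_epoch = 0
    best_loss = float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember this epoch.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop looking.
            break
    return best_epoch
```

In practice the weights saved at the returned epoch would be kept, since later epochs only improve the fit to the training data.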


At the end of the second-stage training, the model achieved a mean percent accuracy of
98.33% and a standard deviation of residuals of 3.31% for one-week predictions on the
validation split (a portion of the dataset the model was not trained on). Because the data
itself was percentage-based, an individual error could simply be calculated as the
difference between the expected and predicted values. Overall, the model showed
excellent results considering the dataset was relatively small for such a complex task.
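As a rough illustration of how such metrics can be computed from percentage-based data, consider the sketch below. The exact definition of "mean percent accuracy" is not given in this paper; the sketch assumes it is 100 minus the mean absolute residual in percentage points, and all function names are illustrative.

```python
import statistics
from typing import List, Sequence


def residuals(expected: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Per-value errors: predicted minus expected, in percentage points."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    return [p - e for e, p in zip(expected, predicted)]


def mean_percent_accuracy(expected: Sequence[float], predicted: Sequence[float]) -> float:
    """Assumed definition: 100 minus the mean absolute residual."""
    abs_errors = [abs(r) for r in residuals(expected, predicted)]
    return 100.0 - statistics.fmean(abs_errors)


def residual_std(expected: Sequence[float], predicted: Sequence[float]) -> float:
    """Population standard deviation of the residuals."""
    return statistics.pstdev(residuals(expected, predicted))
```

For example, expected values of 50, 20, and 10 percent against predictions of 52, 19, and 10 percent give residuals of 2, -1, and 0 percentage points.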


B. Inference:


Drought is a long-term pattern with long-term effects, and simply knowing the next
week’s drought levels is of limited use. Fortunately, feeding the model’s outputs back in as
inputs makes it possible to forecast far into the future, albeit at the cost of accuracy.
This strategy is called autoregression and is used extensively by sequence-generation AI.
The drought model created in this study accepts three more inputs than it produces
outputs, so these must be reconstructed at each step. The least straightforward of these
was the DSCI index, which was the integer-rounded sum of each drought
severity index from D0 through D4. The other two inputs are the sine and cosine
encodings of the week of the year, which were simple to keep track of throughout the
autoregressive sequence. The model was only trained on sequences of length 104, so each
time an output was appended to the input sequence, the earliest week of data had to be
discarded. Through autoregression of the model, a whole year of weekly
drought predictions was generated and displayed in a similar way to the U.S. Drought
Monitor graphical display. Naturally, forecasts closer to the current date have a much
higher accuracy than the ones in the far future. The expected percent accuracy of a
prediction can be approximated by the equation $A \approx 100 \cdot 0.983^{n}$, where $n$
represents the number of the upcoming week; the corresponding percent error is
approximately $E \approx 100 - A$. Note that this forecast was constructed using data up to
July 18 and the predictions span 52 weeks (about 12 months) after this date.
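The autoregressive loop described above can be sketched as follows. The trained LSTM is replaced here by a hypothetical stand-in (`persistence_model`, which simply repeats the latest week), and the feature ordering is an assumption; only the sliding 104-week window, the DSCI reconstruction, and the week bookkeeping follow the text. The DSCI is computed as the rounded sum of D0 through D4, which assumes those columns are in cumulative form.

```python
import math
from typing import Callable, List, Sequence

CONTEXT_WEEKS = 104  # length of the input window used in training


def week_encoding(week: int) -> List[float]:
    """Sine and cosine encoding of a week-of-year in 1..52 (week 1 -> angle 0)."""
    angle = 2.0 * math.pi * (week - 1) / 52
    return [math.sin(angle), math.cos(angle)]


def next_week(week: int) -> int:
    """Advance the week-of-year, wrapping from 52 back to 1."""
    return week % 52 + 1


def make_step(levels: Sequence[float], week: int) -> List[float]:
    """Rebuild the nine inputs from six predicted levels [none, D0..D4]."""
    dsci = round(sum(levels[1:]))  # integer-rounded sum of D0 through D4
    return [*levels, dsci, *week_encoding(week)]


def autoregress(
    window: Sequence[List[float]],
    last_week: int,
    predict_next: Callable[[Sequence[List[float]]], Sequence[float]],
    horizon: int = 52,
) -> List[List[float]]:
    """Roll the one-week model forward `horizon` weeks."""
    if len(window) != CONTEXT_WEEKS:
        raise ValueError("window must contain exactly 104 time-steps")
    window = list(window)
    week = last_week
    forecasts: List[List[float]] = []
    for _ in range(horizon):
        levels = list(predict_next(window))
        week = next_week(week)
        forecasts.append(levels)
        # Drop the earliest week and append the newly predicted one.
        window = window[1:] + [make_step(levels, week)]
    return forecasts


def persistence_model(window: Sequence[List[float]]) -> List[float]:
    """Hypothetical stand-in for the trained LSTM: repeat the latest levels."""
    return window[-1][:6]
```

Any callable that maps a 104-step window to six percentages can be passed as `predict_next`, so the same loop would apply to the trained network.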


Utilizing the same predictions, it was also possible to construct the graph below, which
displays the percentage of area in the no-drought category for the next year. The blue line
tracks the model's predicted value while the “true value” of the prediction lies within the
shaded region with approximately 92% confidence. The size $x$ of the shaded region at a
given week is calculated using the formula $x = 2 \cdot z^{*} \cdot \sqrt{n\sigma^{2}}$,
where $\sigma$ represents the standard deviation of the residuals, $n$ represents the number
of the upcoming week, and $z^{*}$ represents the z critical value of a 92% confidence
interval. This analysis assumes
that the distribution of residuals is normal with a mean of 0.
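The two uncertainty formulas in this section can be evaluated with a short sketch. It uses the reported one-week figures (0.983 and a residual standard deviation of 3.31 percentage points) and takes the z critical value from the standard normal distribution; the function names are illustrative, and the band width is read as 2 · z* · sqrt(n · sigma^2).

```python
import math
from statistics import NormalDist

ONE_WEEK_RATE = 0.983  # compounding factor reported for one-week forecasts
RESIDUAL_STD = 3.31    # standard deviation of residuals, percentage points


def compounding_term(n: int) -> float:
    """The term 100 * 0.983^n from the Inference section.

    It starts at 98.3 for n = 1 and decays as n grows.
    """
    return 100.0 * ONE_WEEK_RATE ** n


def z_critical(confidence: float) -> float:
    """Two-sided z critical value for the given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def band_width(n: int, sigma: float = RESIDUAL_STD, confidence: float = 0.92) -> float:
    """Full width of the shaded region n weeks ahead.

    Assumes independent, zero-mean normal residuals whose variances add
    across weeks, so the standard deviation grows with sqrt(n).
    """
    return 2.0 * z_critical(confidence) * math.sqrt(n * sigma ** 2)


if __name__ == "__main__":
    for n in (1, 4, 13, 26, 52):
        print(n, round(compounding_term(n), 1), round(band_width(n), 1))
```

Because the width scales with the square root of n, the shaded region four weeks ahead is twice as wide as the region one week ahead.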


Applicability:


A. Short-Term Measures: 


In the face of the impending drought, immediate measures are necessary. Californians 
must prioritize water conservation, which could involve stricter water usage policies and 
the promotion of water-saving technologies. In the agricultural sector, the implementation 
of more efficient irrigation systems and the promotion of drought-resistant crops can help 
optimize water use. 


B. Long-Term Goal: Desalination: 


While conservation measures are essential, they are temporary solutions. The long-term
goal should be to develop sustainable water supply solutions. Desalination, the process of
removing salt and other impurities from seawater, emerges as a promising solution.
Although it is currently an expensive and energy-intensive process, anticipated
technological advancements could make it a path to a more water-secure future.


C. Public Education: 


Public education campaigns can raise awareness about the seriousness of the drought 
situation, the importance of water conservation, and the potential of desalination as a 
long-term solution. 


D. Conclusion: 


In conclusion, the hypothesis proposed by this study has been largely validated, 
suggesting that Californians may face challenges if they do not immediately reduce their 
high levels of water consumption and take proactive steps to prepare for the forecasted 
drought. However, the ultimate goal should be to develop sustainable water supply 
solutions, with desalination being a promising avenue to explore. 